In-class Exercise 4: 2nd Order Spatial Point Patterns Analysis Methods
1 Introduction
This exercise looks into the accessibility of water in Osun, a state located in southwestern Nigeria. Crucially, we are analysing how are the water points are located and if the locations of these water points hint at their functionality.
1.1 Dataset Sources
WPdx+ Dataset (CSV) - Taken from WPdx Global Data Repositories. It shows the locations of the water points.
State boundary GIS Datasets of Nigeria - Taken from Humanitarian Data Exchange. It gives geospatial data of Nigeria, in particular the boundaries of its states and Local Government Areas (LGA).
2 Import R Packages
3 Import Datasets
3.1 WPdx+ Dataset
Since we are only analysing Osun’s water points, we will directly filter the water points by the country and state.
# A tibble: 6 × 70
row_id #sour…¹ #lat_…² #lon_…³ #repo…⁴ #stat…⁵ #wate…⁶ #wate…⁷ #wate…⁸ #wate…⁹
<dbl> <chr> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 429123 GRID3 8.02 5.06 08/29/… Unknown <NA> <NA> Tapsta… Tapsta…
2 70566 Federa… 7.32 4.79 05/11/… No Protec… Well Mechan… Mechan…
3 70578 Federa… 7.76 4.56 05/11/… No Boreho… Well Mechan… Mechan…
4 66401 Federa… 8.03 4.64 04/30/… No Boreho… Well Mechan… Mechan…
5 422190 GRID3 7.87 4.88 08/29/… Unknown <NA> <NA> Tapsta… Tapsta…
6 422064 GRID3 7.7 4.89 08/29/… Unknown <NA> <NA> Tapsta… Tapsta…
# … with 60 more variables: `#facility_type` <chr>,
# `#clean_country_name` <chr>, `#clean_adm1` <chr>, `#clean_adm2` <chr>,
# `#clean_adm3` <chr>, `#clean_adm4` <chr>, `#install_year` <dbl>,
# `#installer` <chr>, `#rehab_year` <lgl>, `#rehabilitator` <lgl>,
# `#management_clean` <chr>, `#status_clean` <chr>, `#pay` <chr>,
# `#fecal_coliform_presence` <chr>, `#fecal_coliform_value` <dbl>,
# `#subjective_quality` <chr>, `#activity_id` <chr>, `#scheme_id` <chr>, …
3.2 Nigeria Osun State
As the geospatial data of Nigeria is being imported as a Simple Feature DataFrame, we want to ensure that the dataframe is projected in the right ESPG codes (i.e., 26391, 26392, 26393).
NGA <- st_read(dsn = "data/geospatial/nga_adm_osgof_20190417",
layer = "nga_admbnda_adm2_osgof_20190417") %>%
st_transform(crs = 26392)Reading layer `nga_admbnda_adm2_osgof_20190417' from data source
`C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex01\data\geospatial\nga_adm_osgof_20190417'
using driver `ESRI Shapefile'
Simple feature collection with 774 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS: WGS 84
Simple feature collection with 5 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 481088 ymin: 98142.39 xmax: 1248985 ymax: 1079710
Projected CRS: Minna / Nigeria Mid Belt
Shape_Leng Shape_Area ADM2_EN ADM2_PCODE ADM2_REF ADM2ALT1EN ADM2ALT2EN
1 0.2370744 0.001523921 Aba North NG001001 Aba North <NA> <NA>
2 0.2624772 0.003531104 Aba South NG001002 Aba South <NA> <NA>
3 3.0753158 0.326867840 Abadam NG008001 Abadam <NA> <NA>
4 2.5379842 0.068378506 Abaji NG015001 Abaji <NA> <NA>
5 0.6871498 0.014528691 Abak NG003001 Abak <NA> <NA>
ADM1_EN ADM1_PCODE ADM0_EN ADM0_PCODE date validOn
1 Abia NG001 Nigeria NG 2016-11-29 2019-04-17
2 Abia NG001 Nigeria NG 2016-11-29 2019-04-17
3 Borno NG008 Nigeria NG 2016-11-29 2019-04-17
4 Federal Capital Territory NG015 Nigeria NG 2016-11-29 2019-04-17
5 Akwa Ibom NG003 Nigeria NG 2016-11-29 2019-04-17
validTo SD_EN SD_PCODE geometry
1 <NA> Abia South NG00103 MULTIPOLYGON (((548795.5 11...
2 <NA> Abia South NG00103 MULTIPOLYGON (((547286.1 11...
3 <NA> Borno North NG00802 MULTIPOLYGON (((1248985 104...
4 <NA> Federal Capital Territory NG01501 MULTIPOLYGON (((510864.9 57...
5 <NA> Akwa Ibom North West NG00302 MULTIPOLYGON (((594269 1209...
4 Data Handling
4.1 WPdx+ Dataset
Here, st_as_sfc() converts the column “New Georeferenced Column” in the WPdx+ dataset, which references the water points’ locations, into a Simple Feature geometry Column.
# A tibble: 5,557 × 71
row_id `#source` #lat_…¹ #lon_…² #repo…³ #stat…⁴ #wate…⁵ #wate…⁶ #wate…⁷
<dbl> <chr> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr>
1 429123 GRID3 8.02 5.06 08/29/… Unknown <NA> <NA> Tapsta…
2 70566 Federal Minis… 7.32 4.79 05/11/… No Protec… Well Mechan…
3 70578 Federal Minis… 7.76 4.56 05/11/… No Boreho… Well Mechan…
4 66401 Federal Minis… 8.03 4.64 04/30/… No Boreho… Well Mechan…
5 422190 GRID3 7.87 4.88 08/29/… Unknown <NA> <NA> Tapsta…
6 422064 GRID3 7.7 4.89 08/29/… Unknown <NA> <NA> Tapsta…
7 65607 Federal Minis… 7.89 4.71 05/12/… No Boreho… Well Mechan…
8 68989 Federal Minis… 7.51 4.27 05/07/… No Boreho… Well <NA>
9 67708 Federal Minis… 7.48 4.35 04/29/… Yes Boreho… Well Mechan…
10 66419 Federal Minis… 7.63 4.50 05/08/… Yes Boreho… Well Hand P…
# … with 5,547 more rows, 62 more variables: `#water_tech_category` <chr>,
# `#facility_type` <chr>, `#clean_country_name` <chr>, `#clean_adm1` <chr>,
# `#clean_adm2` <chr>, `#clean_adm3` <chr>, `#clean_adm4` <chr>,
# `#install_year` <dbl>, `#installer` <chr>, `#rehab_year` <lgl>,
# `#rehabilitator` <lgl>, `#management_clean` <chr>, `#status_clean` <chr>,
# `#pay` <chr>, `#fecal_coliform_presence` <chr>,
# `#fecal_coliform_value` <dbl>, `#subjective_quality` <chr>, …
4.1.0.1 Create Simple Feature DataFrame
st_sf() then converts wp_osun to a Simple Feature DataFrame.
Simple feature collection with 5557 features and 70 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 4.032004 ymin: 7.060309 xmax: 5.06 ymax: 8.061898
Geodetic CRS: WGS 84
# A tibble: 5,557 × 71
row_id `#source` #lat_…¹ #lon_…² #repo…³ #stat…⁴ #wate…⁵ #wate…⁶ #wate…⁷
* <dbl> <chr> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr>
1 429123 GRID3 8.02 5.06 08/29/… Unknown <NA> <NA> Tapsta…
2 70566 Federal Minis… 7.32 4.79 05/11/… No Protec… Well Mechan…
3 70578 Federal Minis… 7.76 4.56 05/11/… No Boreho… Well Mechan…
4 66401 Federal Minis… 8.03 4.64 04/30/… No Boreho… Well Mechan…
5 422190 GRID3 7.87 4.88 08/29/… Unknown <NA> <NA> Tapsta…
6 422064 GRID3 7.7 4.89 08/29/… Unknown <NA> <NA> Tapsta…
7 65607 Federal Minis… 7.89 4.71 05/12/… No Boreho… Well Mechan…
8 68989 Federal Minis… 7.51 4.27 05/07/… No Boreho… Well <NA>
9 67708 Federal Minis… 7.48 4.35 04/29/… Yes Boreho… Well Mechan…
10 66419 Federal Minis… 7.63 4.50 05/08/… Yes Boreho… Well Hand P…
# … with 5,547 more rows, 62 more variables: `#water_tech_category` <chr>,
# `#facility_type` <chr>, `#clean_country_name` <chr>, `#clean_adm1` <chr>,
# `#clean_adm2` <chr>, `#clean_adm3` <chr>, `#clean_adm4` <chr>,
# `#install_year` <dbl>, `#installer` <chr>, `#rehab_year` <lgl>,
# `#rehabilitator` <lgl>, `#management_clean` <chr>, `#status_clean` <chr>,
# `#pay` <chr>, `#fecal_coliform_presence` <chr>,
# `#fecal_coliform_value` <dbl>, `#subjective_quality` <chr>, …
Like in importing the geospatial data of Nigeria Osun State, st_transform() is used to re-project the geographic coordinate system to projected coordinate system as the projected coordinate system allows for better analysis involving measurements.
5 Geospatial Data Cleaning
5.1 Filtering Redundant Fields
Looking at Nigeria’s state boundary Dataset, there are many fields and rows that not necessary for our project. Hence, we will filter for those of Osun and select only “ADM1_EN” and “ADM2_EN” fields which will tell us the 1st and 2nd level administrative zones.
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 189625.6 ymin: 338755.8 xmax: 272238.5 ymax: 447175.5
Projected CRS: Minna / Nigeria Mid Belt
ADM2_EN ADM2_PCODE ADM1_EN ADM1_PCODE geometry
1 Aiyedade NG030001 Osun NG030 MULTIPOLYGON (((213526.6 34...
2 Aiyedire NG030002 Osun NG030 MULTIPOLYGON (((212542.6 40...
3 Atakumosa East NG030003 Osun NG030 MULTIPOLYGON (((265746.8 37...
4 Atakumosa West NG030004 Osun NG030 MULTIPOLYGON (((248871.4 40...
5 Boluwaduro NG030005 Osun NG030 MULTIPOLYGON (((266092.2 43...
5.2 Checking for Duplicated Name
You can see that no duplicated LGAs in the Osun state.
5.3 Excluding Unnecessary Data Points
st_intersection() is particularly chosen to exclude cooridinate points, where the water points’ locations actually overlap with the Osun state. If we are to use wp_sf as it is, we may include water points not in Osun due to data errors.
6 Data Wrangling for Water Point Data
As depicted in the frequency graph below, we can take note that there are rows with no input data “NA”. However, the rest can roughly split into two categories. One being functional water points while the other being non-functional water points.

X.status_clean frequency percentage cumulative_perc
1 Functional 2232 41.94 41.94
2 Non-Functional 1894 35.59 77.53
3 <NA> 734 13.79 91.32
4 Functional but needs repair 236 4.43 95.75
5 Non-Functional due to dry season 146 2.74 98.49
6 Functional but not in use 61 1.15 99.64
7 Abandoned 15 0.28 99.92
8 Abandoned/Decommissioned 4 0.08 100.00
We don’t want our final chart to have “NA” just as it is. It will not be very readable for our audience.
wp_sf_nga <- wp_sf %>%
rename(status_clean = 'X.status_clean') %>%
dplyr::select(status_clean) %>%
mutate(status_clean = replace_na(
status_clean, "unknown" # rename NA to be called "unknown"
))
funModeling::freq(data = wp_sf_nga,
input = 'status_clean')
status_clean frequency percentage cumulative_perc
1 Functional 2232 41.94 41.94
2 Non-Functional 1894 35.59 77.53
3 unknown 734 13.79 91.32
4 Functional but needs repair 236 4.43 95.75
5 Non-Functional due to dry season 146 2.74 98.49
6 Functional but not in use 61 1.15 99.64
7 Abandoned 15 0.28 99.92
8 Abandoned/Decommissioned 4 0.08 100.00
6.1 Extract Water Point Data
In the two code chunks below, we focus on splitting the functional, non-functional and unknown water points so further analysis can be done on the different groups.
6.2 Performing Point-in-Polygon Count
Using st_intersects(), we want to know if the geospatial coordinates of Osun intersect with the geospatial coordinates of the water points. If yes, the dataframe NGA_wp will get the total number of functional, non-functional, unknown and overall water points in each 2nd administrative zone.
NGA_wp <- NGA %>%
mutate('total_wp' = lengths(
st_intersects(NGA, wp_sf_nga))) %>%
mutate('wp_functional' = lengths(
st_intersects(NGA, wp_functional))) %>%
mutate('wp_nonfunctional' = lengths(
st_intersects(NGA, wp_nonfunctional))) %>%
mutate('wp_unknown' = lengths(
st_intersects(NGA, wp_unknown)))
head(NGA_wp, 5)Simple feature collection with 5 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 189625.6 ymin: 338755.8 xmax: 272238.5 ymax: 447175.5
Projected CRS: Minna / Nigeria Mid Belt
ADM2_EN ADM2_PCODE ADM1_EN ADM1_PCODE geometry
1 Aiyedade NG030001 Osun NG030 MULTIPOLYGON (((213526.6 34...
2 Aiyedire NG030002 Osun NG030 MULTIPOLYGON (((212542.6 40...
3 Atakumosa East NG030003 Osun NG030 MULTIPOLYGON (((265746.8 37...
4 Atakumosa West NG030004 Osun NG030 MULTIPOLYGON (((248871.4 40...
5 Boluwaduro NG030005 Osun NG030 MULTIPOLYGON (((266092.2 43...
total_wp wp_functional wp_nonfunctional wp_unknown
1 389 157 154 78
2 175 89 57 29
3 223 98 92 33
4 246 111 103 32
5 129 63 51 15
From the chart below, we can see that most of LGAs have around 100 to 150 water points
ggplot(data = NGA_wp,
aes(x = total_wp)) +
geom_histogram(bins=20,
color="black",
fill="light blue") +
geom_vline(aes(xintercept=mean(
total_wp, na.rm=T)),
color="red",
linetype="dashed",
size=0.8) +
ggtitle("Distribution of total water points by LGA") +
xlab("No. of water points") +
ylab("No. of\nLGAs") +
theme(axis.title.y=element_text(angle = 0))
ggplot(data = NGA_wp,
aes(x = wp_functional)) +
geom_histogram(bins=20,
color="black",
fill="dark blue") +
geom_vline(aes(xintercept=mean(
wp_functional, na.rm=T)),
color="red",
linetype="dashed",
size=0.8) +
ggtitle("Distribution of functional water points by LGA") +
xlab("No. of water points") +
ylab("No. of\nLGAs") +
theme(axis.title.y=element_text(angle = 0))
ggplot(data = NGA_wp,
aes(x = wp_nonfunctional)) +
geom_histogram(bins=20,
color="black",
fill="orange") +
geom_vline(aes(xintercept=mean(
wp_nonfunctional, na.rm=T)),
color="red",
linetype="dashed",
size=0.8) +
ggtitle("Distribution of non-functional water points by LGA") +
xlab("No. of water points") +
ylab("No. of\nLGAs") +
theme(axis.title.y=element_text(angle = 0))
ggplot(data = NGA_wp,
aes(x = wp_unknown)) +
geom_histogram(bins=20,
color="black",
fill="grey") +
geom_vline(aes(xintercept=mean(
wp_unknown, na.rm=T)),
color="red",
linetype="dashed",
size=0.8) +
ggtitle("Distribution of unknown water points by LGA") +
xlab("No. of water points") +
ylab("No. of\nLGAs") +
theme(axis.title.y=element_text(angle = 0))
7 First-order Spatial Point Patterns Analysis
Creating a simple interactive map, we can easily see that most of the water points are located towards the north of Osun, leaving the south with little. This is in spite just zooming out a little, the south is closer to the sea.
tmap_mode("view")
tm_shape(wp_functional) +
tm_dots(col = "status_clean",
pal = "darkblue",
title = "Functional") +
tm_shape(wp_nonfunctional) +
tm_dots(col = "status_clean",
pal = "orange",
title = "Non-Functional") +
tm_view(set.zoom.limits = c(5,25),
set.view = 9) Looking at the functional and non-functional water points separately, both maps reflect the same trend as the above map where the water points are largely located towards the north.
wp_functional_map <- tm_shape(NGA) +
tm_fill() +
tm_borders() +
tm_shape(wp_functional) +
tm_dots(col = "status_clean",
title = "Functional",
palette = "darkblue",
legend.show = FALSE) +
tm_layout(title="Functional") +
tm_view(set.view = 9)
wp_nonfunctional_map <- tm_shape(NGA) +
tm_fill() +
tm_borders() +
tm_shape(wp_nonfunctional) +
tm_dots(col = "status_clean",
title = "Non-Functional",
palette = "orange",
legend.show = FALSE) +
tm_layout(title="Non-Functional") +
tm_view(set.view = 9)
tmap_arrange(wp_functional_map, wp_nonfunctional_map,
asp = 1,
ncol = 2)
As spatstat needs its inputs to be of ppp object type, we need to convert the Simple Feature DataFrame to ppp, which will be need to go through a few conversion.
7.1 Conversion of Datatypes
7.1.1 Converting sf data frames to sp’s Spatial* class
7.1.2 Converting sp’s Spatial* Class into Generic sp Format
7.1.3 Converting Generic sp Format into spatstat’s ppp Format
Below shows the results of both functional and non-functional water points, using ppp object.
7.1.4
7.2 Check for Duplicate Data Points
Duplicated data points should be removed as processes under spatial point patterns analysis are largely assumed to be simple (i.e., no duplicated data).
7.3
7.4 Creating owin Object
An owin object will be created to assist us in confining the data points to only Osun.
7.5 Kernel Density Estimation (KDE)
We will need to rescale the KDE values as the current values in meters will not be easily understood.
7.5.1 Computing KDE using Automatic Bandwidth Selection Method
bw.ppl() is chosen as it highlights clusters more clearly compared to bw.diggle(), while being not misleading as the huge area highlighted resulted from using bw.CvL() or bw.scott().
7.5.2 Converting KDE Output into Grid Object
7.5.3 Converting Gridded Output into Raster
kde_wp_nonfunctional_bw_raster <- raster(gridded_wp_nonfunctional_bw)
projection(kde_wp_nonfunctional_bw_raster) <- CRS("+init=EPSG:26392")
kde_wp_nonfunctional_bw_rasterclass : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045 (x, y)
extent : 176.5032, 291.0438, 331.4347, 454.5201 (xmin, xmax, ymin, ymax)
crs : +init=EPSG:26392
source : memory
names : v
values : -1.306349e-15, 8.406528 (min, max)
7.5.4 Visualising the Output in tmap
8 Second-order Spatial Point Patterns Analysis
8.1 Analysing Spatial Point Process Using L-Function
To conduct our second-order spatial point patterns analysis, we are opting to use Besag’s L-Function. It is a normalised Ripley’s K-Function, which measures the distances between a point and its neighbours within each radius.
Using the envelope() of spatstat package, Monte Carlo simulations have be developed to observe the randomness of the water points.
8.1.1 Functional Water Points
8.1.1.1 Computing L-function estimation

8.1.1.2 Performing Complete Spatial Randomness Test
Ho = The distribution of functional water points at Osun are randomly distributed.
H1= The distribution of functional water points at Osun are not randomly distributed.

8.1.2 Non-Functional Water Points
8.1.2.1 Computing L-function estimation

8.1.2.2 Performing Complete Spatial Randomness Test
Ho = The distribution of non-functional water points at Osun are randomly distributed.
H1= The distribution of non-functional water points at Osun are not randomly distributed.

9 Spatial Correlation Analysis
wp_functional["functionality"] = "functional"
wp_nonfunctional["functionality"] = "nonfunctional"
wp_nonunknown <- rbind(wp_functional, wp_nonfunctional)
head(wp_nonunknown, 5)Simple feature collection with 5 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 212202 ymin: 349210.1 xmax: 270497.9 ymax: 403822.5
Projected CRS: Minna / Nigeria Mid Belt
status_clean functionality geometry
1 Functional but needs repair functional POINT (212810 386707.6)
8 Functional functional POINT (228798.9 403822.5)
28 Functional functional POINT (270497.9 377476.9)
1.1 Functional functional POINT (212202 349210.1)
18 Functional functional POINT (259331.9 399591.4)
tmap_mode("view")
tm_shape(NGA) +
tm_polygons() +
tm_shape(LCLQ_wp) +
tm_dots(col = "nonfunctional",
size = 0.01,
pal = c("orange"),
border.lwd = 0.5) +
tm_view(set.zoom.limits = c(5,25),
set.view = 9) Analysing Spatial Point Process Using L-Function
9.1 Conversion of Datatypes
9.1.1 Converting sf data frames to sp’s Spatial* class
9.1.2 Converting sp’s Spatial* Class into Generic sp Format
9.1.3 Converting Generic sp Format into spatstat’s ppp Format
Below shows the results of both functional and non-functional water points, using ppp object.
L_LCLQ_wp = Lest(LCLQ_wp_ppp, correction = "Ripley")
plot(L_LCLQ_wp, . -r ~ r,
ylab= "L(d)-r", xlab = "d(m)", main="LCLQ of Water Points")
Generating 39 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39.
Done.








